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Abstract 

Background: Life depends on biopolymer sequences as catalysts and as genetic material. A key step in the Origin 
of Life is the emergence of an autocatalytic system of biopolymers. Here we study computational models that 
address the way a living autocatalytic system could have emerged from a non-living chemical system, as envisaged 
in the RNA World hypothesis. 

Results: We consider (i) a chemical reaction system describing RNA polymerization, and (ii) a simple model of 
catalytic replicators that we call the Two's Company model. Both systems have two stable states: a non-living state, 
characterized by a slow spontaneous rate of RNA synthesis, and a living state, characterized by rapid autocatalytic 
RNA synthesis. The origin of life is a transition between these two stable states. The transition is driven by stochastic 
concentration fluctuations involving relatively small numbers of molecules in a localized region of space. These 
models are simulated on a two-dimensional lattice in which reactions occur locally on single sites and diffusion 
occurs by hopping of molecules to neighbouring sites. 

Conclusions: If diffusion is very rapid, the system is well-mixed. The transition to life becomes increasingly difficult 
as the lattice size is increased because the concentration fluctuations that drive the transition become relatively 
smaller when larger numbers of molecules are involved. In contrast, when diffusion occurs at a finite rate, 
concentration fluctuations are local. The transition to life occurs in one local region and then spreads across the 
rest of the surface. The transition becomes easier with larger lattice sizes because there are more independent 
regions in which it could occur. The key observations that apply to our models and to the real world are that the 
origin of life is a rare stochastic event that is localized in one region of space due to the limited rate of diffusion of 
the molecules involved and that the subsequent spread across the surface is deterministic. It is likely that the time 
required for the deterministic spread is much shorter than the waiting time for the origin, in which case life evolves 
only once on a planet, and then rapidly occupies the whole surface. 

Reviewers: Reviewed by Omer Markovitch (nominated by Doron Lancet), Claus Wilke, and Nobuto Takeuchi 
(nominated by Eugene Koonin). 



Background 

When we consider the network of biochemical reactions 
that exists inside a living organism, it is important to 
realize that these reactions are in a dynamical steady 
state, rather than at thermodynamic equilibrium. The re- 
action system is maintained out of equilibrium because 
there is a continual input of energy and material (food) 
and because the rates of reactions are controlled by 
enzyme catalysts that permit desired reactions to occur 
much more rapidly than they would in a non-living 

* Correspondence: higgsp@mcmaster.ca 

Origins Institute and Department of Physics and Astronomy, McMaster 
University, Hamilton, ON L8S 4M1, Canada 



mixture of molecules. The presence of biopolymer cata- 
lysts is an essential feature that distinguishes living sys- 
tems from non-living chemical systems. Another key 
feature of living systems is that they are autocatalytic, 
i.e. they can reproduce and make more of their own 
components. Once again, it is biopolymers that allow 
organisms to do this. Cells use biopolymers to store the 
genetic information that allows them to produce the 
required catalysts for metabolism and to enable replica- 
tion of the genetic polymers themselves. A commonly 
used definition of life is that it is a self-sustained system 
capable of undergoing Darwinian evolution [1]. As it 
is biopolymers that allow life to be self-sustained and 
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enable heredity and evolution, we take the view that it is 
the emergence of autocatalytic biopolymer systems that 
is the key step in the origin of life. Thus, our aim is to 
understand how an autocatalytic biopolymer system 
could have emerged from a non-living chemical system. 

It is usually thought that the interdependent system 
of DNA, RNA and proteins that sustains todays cells is 
too complex to have arisen de novo. A prime candidate 
for a simpler biopolymer system that could have existed 
in early organisms is the RNA world hypothesis [2-5], 
which envisages that RNA sequences played both the 
genetic and catalytic roles. This is supported by the fact 
that RNA is the key component of the ribosome and by 
a large number of experimental studies of ribozymes 
in vitro [6-14]. In stronger forms of the RNA world hy- 
pothesis, it is argued that RNA sequences were the first 
autocatalytic biopolymers, and hence the first living 
system, rather than just an intermediate step between 
the origin of life and current life. It is then necessary 
to demonstrate that formation of functional RNAs was 
possible by non-living chemistry. Progress continues to 
be made on mechanisms of prebiotic synthesis of nucleo- 
tides and RNA oligomers [15-22]. In our view, the case 
that the origin of life occurred via an RNA world is quite 
strong, but it is still far from proven. Several authors have 
emphasized the difficulties associated with the RNA 
world and have considered alternative scenarios [23,24]. 
A review discussing why the RNA world is currently 
the best theory we have for the origin of life, despite 
acknowledged limitations, appeared very recently [25]. 

Although we will consider RNA world scenarios in this 
paper, the two most important points that we emphasize 
here are sufficiently general to apply if some other kind 
of replicating molecular system arose prior to the RNA 
world or instead of it. The first point is that the non- 
living and living states are two alternative dynamically 
stable states of the same chemical system. The second 
point is that the origin of life is a stochastic transition 
between these two states that is initiated by concentra- 
tion fluctuations involving relatively small numbers of 
molecules in a localized spatial region. 

We previously showed that two alternative dynamic- 
ally stable states exist in a chemical reaction system that 
models RNA polymerization [26]. The model calculates 
the concentrations of monomers and polymers of differ- 
ent lengths. Monomer synthesis and polymerization 
reactions can occur at a small spontaneous rate in ab- 
sence of ribozymes. Polymerization can also occur at a 
rapid rate due to catalysis by ribozymes, if these are 
present. In one stable state, which we call dead or non- 
living, the ribozyme concentration is negligible and the 
reactions proceed at their spontaneous rates only. In the 
other stable state, which we call living, there is a sub- 
stantial ribozyme concentration and the polymerization 



rate is high, which causes autocatalytic synthesis of more 
ribozymes. We calculated the phase diagram as a func- 
tion of the rate parameters for spontaneous and cata- 
lyzed reactions. We showed that there is a region where 
the reaction system is bistable {i.e. both solutions exist 
simultaneously) and also regions where only one or other 
of the two solutions is stable. We subsequently showed 
that a similar phase diagram exists for RNA systems with 
nucleotide synthase and polymerase ribozymes instead 
of polymerases [27] and in asymmetrical autocatalytic 
systems describing the origin of homochiral biopolymers 
[28]. In this paper, we will argue that the form of the 
phase diagram is a generic feature of models describing 
autocatalysis and the origin of life, and we will present a 
very straightforward model that shows this in its simplest 
and clearest form. 

It is important that the real world must be in the 
bistable region of the phase diagram. The living state 
must exist in the real world, but it cannot be the only 
solution, because the origin of life is a difficult and rare 
event. We do not see spontaneously replicating systems 
popping up easily in every test tube or puddle. The exist- 
ence of two stable states is also demonstrated by the fact 
that it is easy to kill an organism by depriving it of some 
required substance, but the organism is not easily 
brought back to life if the substance is resupplied {e.g. an 
organism that died of suffocation is not brought back to 
life by giving it oxygen). We presume that the world 
began in the non-living state, and that it was possible, 
but difficult, for a transition to occur to the living state. 
The question is how the transition could have occurred. 

We previously argued that stochastic concentration 
fluctuations are essential for the origin of life [26]. In a 
large, well-mixed chemical system, we can write down 
deterministic ordinary differential equations to describe 
the concentrations of different types of molecule. The 
non-living system is dynamically stable for ever in such 
a deterministic case. In a system of finite volume, with 
finite numbers of molecules, stochastic fluctuations can 
occur that allow the system to jump from the non-living 
to the living state. In the RNA polymerization model, it 
is presumed that only fairly long RNA sequences have any 
chance of being functional ribozymes. Such sequences are 
very rare, and may only be present in a handful of copies, 
even though monomers and short oligomers may be 
common. Stochastic fluctuations in the ribozyme con- 
centration are thus important, even if the monomer con- 
centration is effectively deterministic. 

In this paper, we want to investigate more carefully 
the factors that determine the time required for the sto- 
chastic transition to occur. We observed previously that 
the transition to life occurs most easily in systems of 
intermediate size [26]. If the system is too small, it is 
very difficult to create sequences long enough to be 
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ribozymes. If the system is too large, the concentration 
fluctuations are too small to permit the transition to 
occur. It is clear, however, that there is something 
missing from this story. In reality, molecules have a finite 
diffusion rate. Molecules have a finite lifetime before they 
are broken down or destroyed. Only molecules relatively 
close to one another on the surface of a planet have any 
chance of interacting with one another. There must be 
a length scale controlled by diffusion above which dif- 
ferent regions of space are effectively independent of 
one another on the time scale of the molecular lifetime. 
The assumption that the system is well mixed breaks 
down at larger lengths scales than this. Our aim in this 
paper is therefore to study two-dimensional spatial models 
(representing the surface of a planet) in which reactions 
occur locally and there is a finite rate of molecular diffu- 
sion. We will show that the concentration fluctuations 
that lead to the origin of life can occur locally in any one 
region of space. Once the stochastic transition has oc- 
curred in one place, the living state can then spread 
deterministically across the rest of the surface. 

Results 

A generic phase diagram for replicating molecules 

In this section, we briefly summarize our RNA poly- 
merization model [26] and show how the phase diagram is 
obtained in this case. We then introduce a very simple 
generic model for replicating molecules and show that the 
phase diagram is equivalent. 

In the RNA system, we suppose that precursor "food" 
molecules are available in the environment at concentra- 
tions Fj and F 2 . We suppose that monomers, denoted 
by A, can be synthesized from i 7 1> with rate constant s. 
These monomers can react with F 2 to produce activated 
monomers, A*, with rate constant a. RNA polymers of 
length n are denoted A n . An activated monomer can 
react with a polymer to extend its length by 1 with rate 
constant r. All molecules can escape from the system at a 
rate u. If the system is well-mixed, it can be described by 
the following ordinary differential equations: 

dA 

— = sF x - aF 2 A - rAA* - uA (1) 
dt 

dA 

-^ = rA*{A n „ l -A n )-uA n (2) 
dA* 

— = aF 2 A - rA*{A +P)- uA* (3) 
dt 

In Equation 3, P is the total polymer concentration 
of all lengths: P = 0 A n . We assume that there is a 
minimum length m above which polymers have the 
possibility to act as catalysts for the polymerization step. 



The concentration of polymers of at least length m is 
P m = ^^ n>m A n < The polymerization rate constant is 

r = r Q + kP m , (4) 

where r 0 is a small spontaneous polymerization rate and 
kP m is the term due to ribozyme catalysts. The constant 
k is the catalytic efficiency of the long polymers, which 
may be written as k = k(f, where / is the fraction of the 
long polymers that function as catalysts and k 0 is the 
catalytic efficiency of a functional catalyst. 

We refer to Eqn. 4 as the feedback equation, because 
it says that the presence of catalysts feeds back into the 
polymerization rate and increases the rate of formation 
of more catalysts. The stationary states of the molecular 
concentrations can be obtained from Equations 1-4 
using the methods described previously [24]. The key 
point is that there is a dead (or non-living) state in 
which P m is very small and r ~ r 0 , so that polymerization 
occurs at the spontaneous rate, and there is a living state 
in which kP m >> r 0 , so that polymerization is autocata- 
lytic and occurs at a much higher rate than the spontan- 
eous rate. Figure 1 shows a phase diagram as a function 
of k and r 0 . We are most interested in the region where 
both states are stable. The bistable region occurs if r 0 is 
fairly small and k is fairly large. If k is too small, only the 
dead state is stable, and if k is too large, only the living 
state is stable. Furthermore, if r 0 is large then there is 
only one stable state for all values of /c, and this changes 
smoothly from dead to living as k is increased across the 
dotted line in Figure 1. An example of this phase dia- 
gram for different parameter values has previously been 
given [26] . We have also considered the case where feed- 
back is in the monomer synthesis rate [27] and have 
extended the model to deal with the possibility of chiral 
monomers [28]. Bistable regions were also found in 
these cases. A bistable region equivalent to ours was 
observed in related models for autocatalytic polymers 
[29-31], although no explicit phase diagram was drawn. 
When designing our original model, we were aware of 
the toy model for the origin of life by Dyson [32]. This 
model is described in a very abstract way with no expli- 
cit chemical reaction equations. Nevertheless, the central 
ideas of living and dead stable states and a stochastic 
transition between these states exist in this model, and 
the phase diagram of Dyson s model also has a bistable 
region similar to ours. 

Having seen that the kind of phase diagram in Figure 1 
occurs in many different models, we now wish to intro- 
duce the simplest model we can that has the same phase 
diagram. The simplified model will then be useful to in- 
vestigate the dynamic properties of the stochastic tran- 
sition to life in the following sections of this paper. 



Wu and Higgs Biology Direct 2012, 7:42 
http://www.biology-direct.eom/content/7/1/42 



Page 4 of 1 5 



Z 3 



1 1 


i 


- 


Living Only 


Living + Dead 




i i i i 



0.02 



0.08 



0.1 



0.04 0.06 
Spontaneous Polymerization Rate r 0 

Figure 1 Phase diagram for the RNA polymerization model as function of catalyst feedback efficiency k and spontaneous 
polymerization rate r 0 with the other parameter set as: m=5, s=10, a=10 and u=\. 



We consider a single type of replicating molecule 
whose concentration is cp. Replicators can appear at rate 
s, representing a slow rate of spontaneous synthesis by 
random polymerization. This process is independent of 
the replicator concentration. Existing replicators may be 
copied with a rate constant r, representing a process of 
non-living template-directed synthesis. This process is 
proportional to the current replication concentration, 
process is proportional to the current replication cp. 
Replicators may also act as polymerases that catalyze 
replication by using another replicator as a template. 
The rate constant for this catalytic process is k and this 
process is proportional to the square of the current con- 
centration, cp 2 . The increase in replicator concentration 
is limited by finite resources. The simplest way to model 
this is to assume a finite carrying capacity of the system, 
corresponding to a concentration cp = 1, and to multiply 
all the growth rates by a factor (l-<p). Finally, replicators 
die (or are destroyed or removed from the system) at a 
rate u. The deterministic dynamical equation for cp is: 



dip 
dt 



(s + r(p + k(p 2 )(l - cp) 



ucp 



(5) 



In many models of population dynamics in evolution 
and ecology, the linear growth term, rep, is natural, and 
the spontaneous term, s, and nonlinear term, kep 2 are 
not considered. However, when considering the origin of 
life, the spontaneous term is essential for generation of 
the initial replicators, and the nonlinear term is essential 
in order to give the possibility of two stable states. The 
linear term is less important; therefore we will first 



consider the case where r = 0. The stationary states are 
the roots of this cubic equation. If s is small and k is 
fairly large, there are two stable states, cp<^- j and (p 2 , 
separated by an unstable state cp 3 . The lower stable state 
is a dead state, controlled by the balance between spon- 
taneous generation and death: cp~s/u. The upper stable 
state is a living state controlled by the catalytic term k. 
If k is large, the concentration will approach the carrying 
capacity, <p 2 ~l. It is easy to obtain phase diagram for 
existence of dead and living states as in Figure 2. This 
has the same regions as Figure 1. If the linear growth 
term r, is non-zero, the positions of the phase boundaries 
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Figure 2 Phase diagram for the simple replicator model 
(Eqn. 5) as function of catalytic feedback efficiency k and 
spontaneous synthesis rate constant s with u = 1. Boundaries 
are shown for three different values of r. 
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move, but shape of the diagram is qualitatively un- 
changed (also shown in Figure 2). We therefore argue that 
this kind of phase diagram is a generic feature of replicator 
models for the origin of life, and that Equation (5) is the 
simplest possible dynamical equation that has this phase 
diagram. 

The two's company model 

The main aim of this paper is to consider the influence 
of spatial concentration fluctuations and limited diffu- 
sion rate on the stochastic transition that leads to life. 
Therefore, we will now define a spatial lattice model, 
which we call the Twos Company Model, that is a 
spatial version of Eqn. 5, and reduces to Eqn. 5 when the 
system is well mixed. 

We consider a 2D square lattice of LxL sites. The 
number of molecules on any one site may be n = 0, 1, 2 
or 3 only. The carrying capacity of the whole lattice is 
therefore 3L 2 . If there are N molecules on the lattice, the 
mean concentration relative to the carrying capacity is 
cp = N/(3L 2 ). There are 3-n vacancies on a site with n 
molecules. Vacancies are treated as resources, and the 
rates of spontaneous, linear, and replicative growth are 
all proportional to the number of vacancies. The rate of 
adding a molecule by the spontaneous reaction is 
defined as s times the number of resources, i.e. s(3-n). 
The rate of adding a molecule by the linear growth 
process is defined as r/2 times the number of molecules 
times the number of resources, i.e. rn{3 - n)l2. This is 
equal to r when n = 1 or 2, and zero otherwise. The 
rate of catalytic replication is defined as k/2 times the 
number of ways of picking a replicator-template pair 
times the number of resources, i.e. n{n - 1)(3 - n)k/2. 
This is k when n = 2 and zero otherwise. The replication 
process in the model can only occur when n = 2. "Twos 
company, three s a crowd"; hence the name of the model. 
Combining the three birth processes, we obtain the 
total birth rate of molecules on a site with n molecules, 
as summarized in Figure 3. The death rate of a molecule 
on a site with n molecules is un (i.e. u per molecule). 

Hopping of molecules between sites is implemented 
using either local or a global hopping rules in the fol- 
lowing way. Each molecule attempts to hop at rate h. A 
destination site is chosen for the molecule. In the case 
of local hopping rules, the destination is chosen at 
random from the 8 neighbouring sites of the original site. 
In the case of global hopping rules, the destination site 
is chosen at random from all the other sites on the lat- 
tice. The molecule hops successfully to the destination 
site if it finds a vacancy there, i.e. the hop is successful 
with probability l-w/3 (as shown in Figure 3). If the hop 
is unsuccessful, the molecule remains on its original site. 
Further details of the stochastic dynamics used to imple- 
ment the model are given in the Methods section. 
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Figure 3 (a) Transition rates in the Two's Company model due 
to birth and death events on a single site, (b) Each molecule 
attempts to hop to a neighbouring site at rate h and is successful 
with a probability equal to the fraction of vacancies on the 
neighbour. 



The local hopping rules simulate local molecular dif- 
fusion, which is an important part of this model. The 
global hopping rules are intended as a comparison that 
helps to illustrate the importance of local hopping. The 
introduction of local hopping rules means that correla- 
tions exist between the numbers of molecules on neigh- 
bouring sites. However, if h is sufficiently large, the 
system reaches a well-mixed limit where molecules are 
randomly positioned over the whole lattice and there is 
no remaining correlation between sites. In the case of 
global hopping, there is no correlation between sites, 
even when h is small. Below, we will discuss the mean 
field approximation, which is exact for the global hop- 
ping dynamics but it is only an approximation for the 
model with local hopping dynamics. In the Methods sec- 
tion, we show that when h is very large, both global and 
local hopping models reach the same well-mixed limit, 
which corresponds to the generic replicator model in 
Equation 5. 

Simulations are initiated in the dead state and followed 
until a transition to the living state occurs. The concen- 
tration in the dead state should be close to the stable so- 
lution cp 2 of Equation 5. In order to initiate the stochastic 
simulation in the dead state, each lattice site is seeded 
with molecular numbers sampled from a binomial distri- 
bution with average concentration cp 2 . In a typical simu- 
lation, the system remains in the dead state for a long 
time until a localized patch of high concentration arises 
that is sufficiently large to be stable in the living state. 
The origin of this living patch is a rare stochastic event. 
However, once it is formed, the living patch then spreads 
deterministically across the whole lattice. 

A snapshot of a simulation shortly after the transition 
to life is shown in Figure 4. Molecules that were 
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Figure 4 Snapshot of a simulation of the Two's Company model shortly after the transition to life. The living state is a dense patch of 
replicators that have been synthesized catalytically (red). The non-living state is characterized by a low density of replicators created by 
spontaneous synthesis (coloured grey) with small isolated denser patches. Once the living patch is big enough to be stable, it spreads 
deterministically across the lattice. 



synthesized by the spontaneous process are coloured 
grey, and molecules that were synthesized by the catalytic 
process are coloured red. These colours are used for 
illustration, but all the molecules have identical proper- 
ties. In Figure 4, a living patch has arisen in one region. 
Within this region, there is a high concentration (most 
sites have n = 2 or 3), and most molecules are red, indi- 
cating that this patch is sustained by catalytic replication. 
The rest of the lattice is still in the dead state, where 
there is a low concentration (most sites have n = 0 or l), 
and most molecules are grey. Small clusters of red mole- 
cules are nevertheless visible within the dead region. 
These clusters appear and disappear rapidly. The origin 
of life occurs when a cluster of this kind becomes large 
enough to be stable and to continue to grow. If the 
spontaneous rate 5 is suddenly switched to zero in the 
configuration shown in Figure 4, then the isolated mole- 
cules in the dead state rapidly disappear, but the living 
patch remains virtually unaffected and continues to 
spread. The spontaneous process is not necessary to sus- 
tain the living state because the living state is autocata- 
lytic. Of course, the spontaneous rate must be non-zero 
in order to allow the transition to the living state in the 
first place. 



We performed many simulations of this model in 
order to measure the way the time taken for the transi- 
tion depends on the parameters. We define T sys = T reg + 
T sp read> where T reg is the time until a local transition 
occurs in any one region of the lattice, and T spread is the 
time for the living state to spread from one region across 
the rest of the lattice, and T sys is the time required for 
the full lattice to reach the living state. These times were 
measured as described in the Methods section. 

Figure 5 shows the average T sys as h is varied with a 
fixed system size. A comparison of the transition times 
with local and global hopping rules is shown for a small 
lattice with L = 10. For large h the measured times for 
local and global hopping converge, showing that the sys- 
tem is well mixed. For smaller /z, the transition time with 
local hopping is always shorter than that with global 
hopping. This shows that the local concentration fluc- 
tuations that arise in the local hopping model make the 
transition much easier. It can be seen that if h is very 
small, the transition time becomes very long, both for 
local and global hopping. It should be remembered that 
the spontaneous generation rate 5 is very small, so the 
concentration of molecules in the dead state is very 
small. For the catalytic process to occur, it requires two 



Wu and Higgs Biology Direct 2012, 7:42 
http://www.biology-direct.eom/content/7/1/42 



Page 7 of 1 5 



10 6 



10 5 



10 4 



f 10 3 

8. 

TO 

1 ,0*t 



10 1 

Figure 

system 




Local Hopping with L=10 - 
Global Hopping with L=10 - 
Local Hopping with L=100 - 



10 1 



10 2 

Hopping Rate h 



10 3 



5 Effect of hopping rate on transition time for fixed 
size with other parameters set as: s=0.02, r=0, k=9. 



10 8 
10 7 r 
10 6 " 

E 

F 10 5 
o 

1 10 4 
I 10 3 

10 2 

10 1 



10° 



10 1 



10 2 



Local Hopping System Transition Time h=45 - 
Local Hopping Region Transition Time h=45 - 
Global Hopping Transition Time h=45 - 
Local Hopping System Transition Time h=24 
Local Hopping Region Transition Time h=24 




10 3 10 4 

System Size L 2 



10 5 



10 6 



Figure 6 Average transition time from dead state to living 
state as function of system size with parameters set as: s=0.02, 
r=0, k=9, u=1. The straight line illustrates a scaling of 1/L 2 . 



molecules on the same site. If h is too small, molecules 
do not encounter one another frequently. Furthermore, 
if molecules do encounter one another and a replication 
occurs, there will now be three molecules on one site, 
which blocks further replication. It is necessary for one 
of these molecules to hop away before a second replica- 
tion can occur; hence, h should not be too small. It can 
be seen that, for the local hopping rules, there is an 
optimum diffusion rate at which the transition is fastest. 
This effect is not very strong in the small lattice (L = 10), 
but is very pronounced in the larger lattice (L = 100) also 
shown in Figure 5. The transition time decreases by 
several orders of magnitude in the middle of the range 
of h. It is not possible to show the transition time for the 
global hopping case for the larger lattice size because it is 
too long to measure accurately in the simulation with 
these parameters. In summary, Figure 5 shows that local 
diffusion is essential in order to allow the transition to 
life to occur at a reasonable rate in large systems, and 
when diffusion is local, there is an optimal intermediate 
rate of diffusion. 

Figure 6 shows the variation of transition time with 
system size when the hopping rate is kept fixed. For the 
global hopping case, it is only possible to do these simu- 
lations for fairly small lattice sizes, because the transition 
time increases very rapidly with L. For the local hopping 
case, it is possible to carry out simulations over a much 
wider range of L, and it can be seen that the transition 
time actually decreases with L, if L is sufficiently large. 

As shown in left part of Figure 6, when the system is 
small, it is more or less well mixed. Hence the transition 
will happen globally across the whole system at the same 
time. When the system is larger, it is no longer homoge- 
neous. The transition will happen at a local region first, 
then it will spread out to the whole system. The number 
of independent local regions should be proportional 



to L 2 . The average time taken for the transition to occur 
in any one of these regions should vary inversely with the 
number of regions, i.e. T reg ~ 1/L 2 . If the spread of the 
living state is rapid, the spreading time, T spread is small 
compared with the time taken for the transition to occur, 
T reg . In this case the time taken for the whole system to 
go through the transition is close to T reg . It can be seen 
that there is an intermediate range of L in Figure 6, 
where T reg and T sys are very close to one another and 
where both scale as 1/L 2 , as expected from this argu- 
ment. Comparison of the results with h = 24 and h = 45 
shows that when diffusion is slower, the transition from 
homogeneous system to heterogeneous system happens 
for a smaller system size. 

If the system size is extremely large, there comes a 
point where T spread is comparable to T reg , and the curves 
for T sys and T reg separate in Figure 6. At this point it is 
possible for the transition to life to occur in a second 
region independently before the living state from the first 
transition has spread across the lattice. Beyond this point, 
T reg continues to decrease, but T sys reaches a constant 
value, which is what we expect if there are multiple ori- 
gins happening in different places. 

A useful way to understand the effects of the limited 
diffusion rate in this model is to calculate the mean field 
solution. In the Methods section, we calculate the prob- 
ability P n that there are n molecules on a site, using a 
mean field approximation that ignores correlations be- 
tween neighbouring sites. The mean field solution 
depends on h and it is not equivalent to the well-mixed 
case, except for the limit of large /z, where both the lat- 
tice model and the mean field approximation converge 
to the well-mixed case. We expect that the mean field 
approximation will be fairly good for the local hopping 
model if h is large, because correlations between neigh- 
bouring sites should be small in this limit. For the global 



Wu and Higgs Biology Direct 2012, 7:42 
http://www.biology-direct.eom/content/7/1/42 



Page 8 of 15 



hopping model, the mean field solution is exact, even for 
small h. 

In Figure 7, we compare the solutions of the mean 
field equations with dynamical simulations. The black 
curve is the homogeneous solution from the roots of the 
cubic Equation 5. This is the well mixed limit of the 
spatial model The solid black lines illustrate the stable 
living and dead states and the dashed line is the unstable 
intermediate state. The red, green and blue lines show 
the stable and unstable solutions of the mean field equa- 
tions with three values of h. For large /z, the mean field 
solution is close to the well-mixed limit, as expected. 
The symbols show average concentrations of molecules 
obtained from simulations of the lattice model with local 
hopping. For the living state, the mean field theory 
gives a fairly good approximation to the result with local 
hopping, even for the smaller values of h. This is because 
the density is quite high across the whole lattice in the 
living state and correlations between neighbouring sites 
are quite weak. On the other hand, there are large devia- 
tions between the mean field theory and the results of 
the local hopping model in the dead state when it is close 
to the bistable region, which shows that there are import- 
ant local spatial correlations in this case. 

Figure 7 illustrates that reduction in the local diffusion 
rate has both positive and negative effects on the replica- 
tion process. As h is reduced, the concentration of the 
living state is reduced because there is more local inter- 
ference between molecules on fully occupied sites. Also 
the minimum value of k required for the living state to 
be stable increases as h decreases. Thus, rapid diffusion 
is favourable for the living state, if the living state is 
already established. However, we have already seen in 
Figure 5 that high diffusion rates cause the system to be 
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well mixed, in which case the transition time is extremely 
long. It is therefore important to have a diffusion rate 
that is not too large in order for the transition to the 
living state to occur at an appreciable rate in the first 
place. 

A spatial model for RNA polymerization 

The Twos Company model is the simplest model that 
can be used to study the stochastic transition to life in a 
spatially distributed system. However, we wish to show 
that the same behaviour occurs in other models. We will 
return to the model of RNA polymerization described in 
Equations 1-4 in order to study the effects of diffusion 
and spatial concentration fluctuations in this case. 

The simulation is carried out on 2D square lattice 
composed of L x L sites, with each site defined to have a 
size 1=1. We keep track of how many molecules of each 
kind there are on each site. The reactions of monomer 
synthesis, activation, and polymerization all occur locally 
on individual lattice sites. The implementation of sto- 
chastic dynamics for this model using the Gillespie 
algorithm is described in our previous paper [26] for a 
non-spatial model. In the spatial case, the same method 
is used to define reaction rates on each site as a function 
of the numbers of molecules of each type on each site. 
For simplicity, molecules of all kinds are assumed to hop 
with the same rate h. Hopping may either be local 
or global, as with the Twos Company model. However, 
there is no need to impose a maximum number of mole- 
cules per site, because concentration is limited by food 
input rather than by a carrying capacity. Therefore every 
attempted hop is successful, and hops are not blocked 
by the molecules on the destination site, as they are for 
the Twos Company model. 

Figure 8 shows snapshots of the distribution of cata- 
lytic polymers across the lattice. Three time points are 
chosen in order to illustrate the way the catalyst concen- 
tration increases over time. As shown in first row of 
Figure 8, when system is small (L=30), the system is 
fairly well mixed. Hence the transition happens across 
the whole system at the same time. When system is 
larger (L=100), the system is no longer homogeneous. 
The transition happens at a local region first, then it 
spreads out to the whole system as shown in second row 
of Figure 8. When the system is very large (1=1000), the 
time it takes to spread across the whole system is longer 
than the regional transition time. Hence, multiple origins 
occur, as in third row of Figure 8. 

When the transition occurs, the increase in the catalyst 
concentration is accompanied by a sharp decrease in the 
activated monomer concentration. As there are many 
more activated monomers than catalysts, the activated 
monomer concentration varies more smoothly than the 
catalyst concentration. Therefore, we used the decrease 
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Figure 8 Snapshots of the catalytic polymer concentration in the RNA polymerization model at the beginning^), during(f 2 )/ and after 
(f 3 ) the transition from the dead state to the living state for different system sizes (L=30, 100, 1000) and with local hopping rules. 

Parameters are set as: m=5, 5=10, a=10, r 0 =0.02, k=2, u=\, h=50, f=1. 



in the activated monomer concentration as a marker for 
the transition time. This concentration was averaged over 
a sliding time interval, and this was used to define re- 
gional and system transition times, in the same way as 
for the Twos Company model (see Methods section). 

The transition time is shown as function of system 
size in Figure 9. For global hopping, the transition time 
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Figure 9 Average transition time from dead state to living 

state as function of system size with parameters set as: m=5, 

s=10, 0=10, r 0 =0.02, k=2, u=1, h=50, f"=1. The straight line 

illustrates a scaling of 1/L 2 . 
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follows a U-shaped curve, for the same reasons as for the 
non-spatial model of RNA polymerization (Figure seven 
of [26]). If the system is too small, polymers long 
enough to be catalysts do not easily arise, and if the 
system is too big, the concentration fluctuations are too 
small in the global case to allow the stochastic transition 
to occur. In contrast, when hopping is local, the transi- 
tion occurs easily for large system sizes. When the lattice 
is very small, the time for the transition with local hopping 
is close to that with global hopping, but when the lattice 
size is larger, the transition is much faster with local 
hopping. The transition time follows the same shape 
curve for this model as for the Twos Company model in 
Figure 6. There is an intermediate range where the tran- 
sition time scales as II L 2 and where T sys = T reg , and there 
is a range for very large lattice size where multiple origins 
occur and where T sys > T reg . 

Discussion and conclusions 

The models we have discussed here have features in 
common with various theoretical models discussed by 
other authors. There has been a lot of work on mainten- 
ance of replicators in the face of accumulation of muta- 
tions. The standard error threshold problem [33] deals 
with linear replicators in a non-spatial model. A spatial 
lattice version of this problem has been studied [33] and 
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this has been compared to the case of catalytic replica- 
tors in a spatial system [34] using a model similar to the 
Twos Company model The fundamental difference be- 
tween linear and catalytic replicators is that when a rep- 
licator acts on another sequence as template it is being 
altruistic. Evolution of catalytic replicators is therefore 
related to the evolution of cooperation. A large number 
of studies have shown that cooperation can evolve in 
spatial models when the equivalent non-spatial models 
would be invaded by parasites [35-43]. These models 
make it clear that formation of clusters of cooperating 
molecules and formation of spatial patterns is generally 
very important for evolution and survival of catalytic 
replicators. However, none of these papers dealt with the 
origin of the replicating system. Our work also shows 
that incorporation of local diffusion and spatial concen- 
tration fluctuations is essential to enable the origin of life 
to occur with an appreciable probability. Other models 
[44-46] that deal with prebiotic replicators and consider 
the transition to a replicating state are closer to our 
work. Spatial effects are also apparent in these models. 
Another approach to the origin of the RNA world [47] 
uses very complex sets of reaction equations for the 
emergence of RNA replication and also makes a link 
with nucleation phenomena. Models for the origin of 
replication [29-31] have been studied previously that are 
very similar to ours, however the stochastic dynamics 
and spatial effects were not looked at by these authors. 

Despite important differences in their details, the 
Twos Company model and the RNA polymerization 
models exhibit surprisingly similar behaviour, both for 
the phase diagram (Figures 1 and 2) and the dynamics of 
the transition to life (Figures 6 and 9). The similarity 
arises because the models share the features of nonlinear 
feedback and limited resources. For the Twos Company 
model, the replication rate of the molecule depends non- 
linearly on the concentration of the molecule because of 
the kep 2 term, and limited resources are modelled by the 
rule that there can be no more than 3 molecules on the 
same site. For the RNA polymerization model, the nonli- 
nearity arises because of the feedback in polymerization 
(Eqn. 4) because of the way the ribozyme concentration 
depends on the polymerization rate (see [26]). The lim- 
ited supply of food molecules from the environment is 
included in the model, and this prevents the ribozyme 
concentration increasing indefinitely in the living state. 
Given these two features, both models have two stable 
states and a bistable region in the phase diagram. 

The two models exhibit essentially the same stochastic 
transition arising as a result of concentration fluctua- 
tions. When the system is well mixed, fluctuation will 
decrease with increasing system size due to increasing 
number of molecules in the system. Hence the transition 
becomes increasingly difficult for larger system sizes. 



With limited diffusion, the system is no longer well 
mixed and breaks down into many weakly interacting 
regions. The larger the system, the more such independ- 
ent regions there are in which the transition can occur. 
Therefore the transition becomes faster for larger sys- 
tems in a 2D model with limited diffusion. 

Both models illustrate that multiple origins of life 
could occur if the system is large enough. As far as we 
know, however, all current life on Earth is descended 
from a single origin. The size of the Earth is very much 
larger than the distance over which a molecule could 
easily diffuse. However, the window of time during 
which life arose is narrowed to a few hundred million 
years by geological evidence [48]. If the origin of life is a 
rare event taking hundreds of millions of years, it is still 
perfectly possible that it occurs only once (or not at all) 
on a large planet. Once the living state has arisen, the 
time for multiplication and spread across the planet is 
determined by molecular time scales (replication and 
diffusion) which are likely to be very much faster than 
the waiting time for the rare event that creates life in the 
first place. In other words, it is likely that the real world 
falls in the intermediate part of Figure 9, where the tran- 
sition time varies like 1/L 2 and there is a single origin. 
However, the scenario involving multiple origins could 
still be compatible with the view of single tree of life if 
there were competition between different forms of life 
and only one remained, or if there were symbiosis and 
merging of features that derived from different origins. 

Two parameters that are important in the RNA 
polymerization model are m, the minimum sequence 
length required to be a catalyst, and f, the probability 
that a sequence of length n > m is a catalyst. In this 
paper, we ran many simulations for each set of para- 
meters in order to study the way the mean transition 
time depended on h and L. For this reason, it was neces- 
sary to choose parameters for which the transition oc- 
curred fairly easily. Therefore, we made the length short 
(m = 5) and chose the easiest case where / = 1. In our 
first paper, we looked at much longer ribozymes (m = 50) 
with / much less than 1, which seems more realistic if 
we are looking for ribozymes that would have specific 
sequences and structures, and the essential behaviour is 
very similar. In reality, we do not know how long the 
first catalytic sequences might have been. We are used 
to thinking of enzymes hundreds of amino acids long in 
current organisms, and currently known polymerase 
ribozymes are all well over 100 nucleotides long [7,9,14]. 
Hence, we tend to think of biopolymers as rather long 
and very specific in their sequence. It seems reasonable 
to suppose that the earliest catalysts must have been 
much shorter and less sequence specific if they were 
to have had any chance of arising in prebiotic condi- 
tions. Short sequences would be relative common, and 
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relatively easy to replicate even if the fidelity was quite 
low. These are important advantages of short sequences, 
even if they have only very limited catalytic ability com- 
pared to much longer, well-adapted ribozymes. 

In all our models of RNA polymerization, we have 
made the simplification of ignoring nucleotide sequences 
and keeping track only of polymer length. It is supposed 
that polymerase ribozymes catalyze all polymerization 
reactions equally. Hence, they will catalyze the formation 
of random non-functional sequences in addition to in- 
creasing their own formation rate. This would be ex- 
tremely inefficient, and it seems likely that real ribozymes 
could have done better than this by being more specific. 
Nevertheless, an autocatalytic living state arises even in 
this simple non-specific case, and there is a clear differ- 
ence between the living and the dead state, which is the 
aspect of the model we wish to emphasize. The case of 
non-specific polymerases has sometimes been called 
"pre-life" [29-31], and the term "life" has been reserved 
for a specific polymerase that catalyzes only the steps that 
lead to formation of its own sequence. Such a specific 
polymerase would have been much more efficient, as it 
would not waste its time making non-functional polymers. 
However, it is difficult to see how a specific polymerase 
would distinguish between the polymerization steps that 
contribute to its own formation and those that do not. In 
fact, polymerases in modern organisms are general, not 
specific. They can synthesize any general sequence, but 
they make specific sequences because they make use of 
templates, not because they distinguish between the reac- 
tion steps that link different kinds of monomers. 

Of course, not all sequences are equivalent in the real 
world, and whether a molecule can function as a catalyst 
depends on its sequence and structure. It would be pos- 
sible to consider a model for polymerization of four 
nucleotides in which the base sequence of each polymer 
in the mixture is stored in memory. One could then de- 
termine whether a sequence was a catalyst using criteria 
based on secondary structure or whether it contained a 
specific sequence motif within it. This would be more 
realistic than simply giving each sequence a probability 
/ of being a catalyst, as in our current model, but the 
essential point that the formation of catalysts causes a 
feedback in the polymerization rate would remain un- 
changed in a more complex model that considered RNA 
sequence and structure in more detail. However complex 
and realistic we try to make a model, there will always be 
factors that must be left out. In some cases, simplicity is 
a virtue in computational models if this helps to show 
the essential points clearly. We believe that the simple 
models studied here illustrate several essential features 
of the origin of life problem as it occurred in the real 
world, i.e. the origin of life is a transition between two 
alternative steady states of a chemical reaction system 



that is driven by stochastic concentration fluctuations 
involving relatively small numbers of molecules in a loca- 
lized region of space. 

Methods 

We simulated the stochastic dynamics of the Twos 
Company Model using the Gillespie algorithm. At each 
time step, one elementary reaction was chosen at random 
with a probability proportional to its rate. The duration 
of the time step was a random time r, chosen from an 
exponential distribution P(r) = Re~ RT , where R is the 
sum of the rates of all the elementary reactions that 
could occur at that point in time. 

Simulations were performed to measure the mean 
time until the transition to the living state occurred. The 
simulations were stopped when the global concentration 
became close to cp 2 , the solution of Equation 5 corre- 
sponding to the living state. In order to insure that the 
system was stable in the living state, cp was averaged over 
a sliding time interval of width 20, which is 20 times the 
average life time of a molecule. The simulation was 
stopped when the sliding average density reached 90% 
of cp 2 . This time was then recorded as the system tran- 
sition time of one trial. The average system transition 
time, T sySt was obtained from 1000 trials for each set of 
parameters. When the local hopping rule is used, the 
transition happens at one place then spreads across the 
whole system. We may write T sys = T reg + T spread , where 
T reg is the time until a local transition occurs in any one 
region, and T spread is the time for the living state to 
spread from one region across the rest of the lattice. 
The system was divided into multiple regions of size 
10x10 lattice sites. The concentration was calculated 
within each region and averaged over a time window in 
the same way as the global concentration. T reg was 
defined as the point at which the sliding window concen- 
tration of any one region reached 90% of cp 2 . 

We will now demonstrate that the well mixed limit of 
the lattice model corresponds to the generic replicator 
model in Equation 5. The total birth rate of molecules, 
a m on a site with n molecules is 

do = 35 

a\ = 2s + r (6) 
a 2 = s + r + k 

The death rate of a molecule on a site with n mole- 
cules is nu. Let P n be the fraction of sites that have n 
molecules. The rate of change of N due to birth and 
death processes is 

^ = L 2 (a 0 P 0 + {a x - u)P x + (a 2 - 2u)P 2 - 3uP 3 ) 
at 

(7) 
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Hence 
dep 1 
dt ~3 



(a 0 P 0 + (a x - u)Pi + (a 2 - 2u)P 2 - 3uP 3 ). 



(8) 

Note that these equations do not depend on the hop- 
ping rate because the hopping process does not change 
the total number of molecules. However, the P n prob- 
abilities do depend on h. If both N and L are large and h 
is large enough to be in the well-mixed limit, where 
molecules are randomly distributed between lattice sites, 
P n has a binomial distribution: 

3! 



P n = <p n (l - <p) 



3-n 



n\(3-n)\ 



(9) 



Substitution of these probabilities into Equation 8 gives 
Equation 5. Thus, the lattice model reduces to the desired 
continuum model if there is rapid mixing of molecules. 

We now consider the rates of gain and loss of mole- 
cules due to molecules hopping to and from a site with 
n molecules. Let q nm be the conditional probability that 
there are m particles on the neighbouring site, given that 
there are n on the first site. The rate that n increases by 
1 due to a molecule arriving from a neighbouring site is 

Pn = %*i + 2a n2 + 3#„ 3 ) (l - |) = 3(l - |) *<P«, 

(10) 

where cp n = (q nX + 2q n2 + 3q n3 )l3 is the mean density on 
sites that are neighbours of sites with n particles. Note 
that there are 8 neighbours from which a molecule could 
come, but only 1/8 of the hops from any one neighbour 
move to the site in question. Therefore these two factors 
of 8 cancel out. 

The rate at which n decreases by 1 due to a molecule 
leaving the site is 

Pn = hn (Jl- + (l ~ ^)#*2 + (1 - 1)#h3 

= hn(l - cp n ), 

(11) 

We may now write a set of equations for the rates of 
change of P n that include birth, death and hopping. 



dPo 
dt 

dPi 
dt 



dP2 
dt 



dPs 
dt 



- -a 0 P 0 + uPi +p 1 P 1 - p^P 0 

a 0 P 0 - a x P x + u(2P 2 - P x ) + p~P 2 - p\P x 
-PiPi+ptPo 

a x P x - a 2 P 2 + u(3P 3 - 2P 2 ) + p'P 3 
-ptP 2 -p 2 P 2 +piPi 
a 2 P 2 - 3uP 3 - p^P 3 + p\P 2 



This set of equations is not closed, because the hop- 
ping rates depend on the conditional probabilities q nm . 
A standard way of approximating the solution to lattice 
models is to make a mean field approximation that 
ignores correlations between neighbours. We assume 
that q nm = P m , in which case: 



^>n = <P 



1 2 

-Pi +-P 2 

3 3 



(13) 



With this approximation, we get the following equa- 
tions, which are a closed mean field approximation to 
the dynamics. 



dPo 
dt 

dPi 
dt 

dP2 
dt 

dPs 
dt 



-3sP 0 + uP x + h((l - (p)Pi - 3cpP 0 ) 

3sP 0 - (2s + r)Pi + u(2P 2 - Pi) 

+ h(2(l - cp)P 2 - 2(pP x - (1 - cp)P 1 + 3cpP 0 ) 
(2s + r)Pi - (s + r + k)P 2 + u(3P 3 - 2P 2 ) 

+ h(3(l - <p)P 3 - cpP 2 - 2(1 - cp)P 2 + 2cpP x ) 



(s + r + k)P 2 - 3uP 3 

+ h(-3(l - cp)P 3 + cpP 2 ) 



(14) 



(12) 



The mean field solutions shown in Figure 7 are the sta- 
tionary states of these equations. The simulation with glo- 
bal hopping rules follows the mean field solution exactly 
because there are no correlations between sites when hop- 
ping is random. The mean field solution is an approximate 
solution for the case with local hopping rules. 
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address the conditions under which a living state can emerge from a non- 
living state, and find that a spatially extended system with moderate 
diffusion can greatly facilitate the emergence of the living state. Most 
importantly, in this regime, the probability of emergence scales with the size 
of the system, so that the emergence of life somewhere is virtually 
guaranteed for sufficiently large systems. Overall, this is a very nice 
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topic towards the end of their paper, and I'd like to see the arguments 
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the origin of life happen within, say, a billion years? What would the 
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folding to the same secondary structure would have had similar functions, 
which makes f larger. This argument also suggests that it may be easier to 
begin with rather short ribozymes with a rather low catalytic ability than with 
longer, but rarer sequences that are better catalysts. These questions are clearly 
important, but we do not know how to put concrete numbers on these 
quantities. To give a quantitative answer to the time scale and required reaction 
rates, we would need to the concentrations of reagents on the primitive Earth, 
the reaction pathways by which nucleotides and RNAs were synthesized, and 
the temperature, pressure and pH at which these reactions were occurring. 
Current knowledge of these things is very incomplete. 

Comment: I think the manuscript would improve if it had a separate 
Methods section containing some of the more technical details of the 
simulations and derivations. Section 4 in particular mixes results with highly 
specific implementation details that don't move the story forward. Similarly, 
other sections contain material that doesn't directly impact the overall story, 
such as the proof that the Two's Company model agrees with Eq. (5). I'm not 
asking the authors to delete this material, just reorganize it such that the a 
reader can get the most important results without also having to read all the 
supporting evidence and materials. 

Response: A methods section has been added at the end of the paper and the 
technical sections have been moved to the methods. The separate sections on 
the Two's Company have now been merged into one main section. 

Reviewer 3: Nobuto Takeuchi, National Center for Biotechnology 
Information, National Library of Medicine, National Institutes of Health, 
USA. (Nominated by Eugene Koonin). 

Comment: In their manuscript, Wu and Higgs present a minimal replicator 
model that exhibits a stochastic transition from a "non-living" to a "living" 
state. Such a transition was originally proposed and investigated by Dyson in 
a different model [32,49]. Using this minimal model, the authors investigate 
the factors determining the waiting time for such a transition. In particular, 
the authors calculate the waiting time as a function of the diffusion rate of 
replicators. The calculation reveals that the waiting time is smallest when the 
diffusion rate assumes an intermediate value. This result is intuitively 
explained by the authors as follows: Too strong diffusion diminishes the 
stochasticity of dynamics and so impedes the transition, whereas too weak 
diffusion prevents replicators from encountering each other and, thus, from 
catalyzing each other's replication. Moreover, the authors examine the 
waiting time as a function of the system size. The calculation shows that the 
waiting time decreases as the system size increases, provided that the 
diffusion is finite. This result sharply contrasts with that obtained for infinite 
diffusion. The authors again give an intuitive explanation: If diffusion is finite, 
a non-living-to-living transition first occurs in a local region and then 
propagates through the entire system; thus, the larger the system, the 
greater the number of local regions (assuming finite diffusion), and so is the 
probability of a transition per unit time. Also, the authors check this result in 
a more complex model that considers RNA polymerization. 
The manuscript describes its content in a clear manner. I obtained only one 
specific question regarding the results (more general comments to follow). 
The transition time is shortest when the diffusion rate takes an intermediate 
value in the minimal replicator model. Does this result also hold in the RNA 
polymerization model in general? 

Response: Yes, although we have not fully investigated the regime of very low 
diffusion rate in the RNA polymerization model. Very low diffusion rate should 
be unfavourable because the catalysts can only be effective if monomers and 
growing chains encounter the catalysts and if newly created catalysts can 
spread beyond the site in which they were created. 

Comment: I have three general comments (and a question). First, the non- 
living-to-living transition exhibited by the models rests on the combination 
of the two processes: the stochastic occurrence of rare events and the 
subsequent deterministic amplification thereof. Interestingly, such a 
combination also plays an important role for the stability of a spatially 
extended RNA-like replicator system [40], Namely, the stable coexistence 
between catalysts and parasites is enabled by the formation of traveling 
wave patterns, whose emergence rests on the combination of the two 
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processes: rare stochastic events in which a few catalysts are spatially 
segregated from parasites and the subsequent expansion of these catalysts 
through replication (followed by infection with parasites). This kind of 
dynamics, in which rare stochastic events are deterministically amplified, 
might be a generic feature of (life-like) systems that have a finite system size 
and positive feedback (see also the last paragraph). 

Response: We agree that it is interesting to note that stochastic effects and 
limited spatial diffusion favour both the origin of life and the stable coexistence 
of replicators and parasites. 

Comment: My second comment is concerned with the implication of the 
current study in relation to Dyson's study [32,49]. Dyson's model is conceived 
under the metabolism-first scenario for the origin of life: it is created as a 
simplest model of metabolic systems and assumes no replicators. By 
contrast, the minimal replicator model investigated in the current study, of 
course, assumes replicators. The RNA polymerization model of Wu and Higgs 
is conceived under the RNA world hypothesis, but does not actually assume 
replicators (i.e., it does not assume template-directed polymerization). 
Despite these differences, all these models exhibit essentially the same 
general result: two stable states that are regarded as non-living or living and 
stochastic transitions between them. Therefore, the general result seems to 
be independent of whether the model assumes metabolism, replicators, or 
the RNA world. This independence is also implied by the reason why these 
models produce the same result. Wu and Higgs identify two features shared 
by their models that are essential for the general result: resource limitation 
and nonlinear (positive) feedback — the latter does not necessarily imply 
replication. These features are clearly shared by Dyson's model as well. Taken 
together, the results of the current study seems to imply that Dyson's 
proposal of identifying the origin of life as a stochastic transition between 
two stable states is orthogonal to whether one considers the metabolism- 
first scenario, the replicator-first scenario, or the RNA world hypothesis. 

Response: Although we like the idea of the stochastic transition to life that was 
introduced first by Dyson, we feel that Dyson was not clear on what was meant 
by the 'active' and 'inactive' states and why the active state corresponds to life. 
In both our RNA polymerization model and the Two's Company model, the high 
concentration of catalysts in the living state is maintained by the reactions 
carried out by the catalysts. The definition of the living state in our models is 
that it has autocatalytic biopolymers. Although we could presumably write 
down a metabolism-only model without biopolymers and without replication 
that would have two stable states and a possibility of a stochastic transition, 
this would be a purely chemical system and it would not constitute a transition 
to life, in our view, because the high-concentration state would not have 
replication, heredity and evolutionary potential. 

Comment: My last comment is concerned with the authors' treatment of the 
linear growth term in the minimal replicator model. In investigating this 
model, the authors regard this term as less important than the other terms. 
However, I find three arguments to challenge this view. First, the linear 
growth term can play an important, though negative, role for the reported 
results of the model. Namely, if the linear growth rate r is greater than the 
decay rate u, the model has only one stable state (Figure 2 shows results for 
r < u). Second, replicators whose growth is described by a linear term 
(exponentially growing replicators, for short) are theoretically the simplest 
possible replicators [50]. Moreover, they have been experimentally 
synthesized [12]. 

The last argument concerns the very core of the current study. The non- 
living-to-living transition conceived by the authors entails rare stochastic 
events and the deterministic amplification thereof. Such a transition might 
be exhibited by exponentially growing replicators as well if one abandons 
the presupposition that the non-living state must be dynamically stable, as 
follows. Let us suppose that RNA molecules are randomly synthesized as 
assumed in the RNA polymerization model. Moreover, some RNA sequences 
are capable of self-replication; however, these sequences form but a tiny 
subset of the sequence space. Now, the non-living state corresponds to the 
state in which the system hardly contains any self-replicating molecules; the 
living state, the state in which the system contains self-replicating molecules 
in abundance. The rare stochastic event corresponds to the emergence of a 
(few) self-replicating molecule(s) through random synthesis; the deterministic 



amplification, the expansion of the nascent population of these molecules 
through self-replication. This argument suggests a topic for further 
discussion: Must "non-living" states be dynamically stable? 

Response: If r > u in our Two's Company model, there will be no non-living 
state and there will be spontaneous multiplication of replicators without any 
waiting. This does not seem realistic to us. It is important to have non-linearity 
in some way in order to have two stable states. In the Two's Company model 
there is only one way to do this, which is the quadratic term representing a 
two-molecule replication process. However, in more complex models there are 
many ways to do it. In the RNA polymerization models, the nonlinearity can 
arise either by feedback in the polymerization rate or in the monomer synthesis 
rate [27]. It is true that the ligase reaction in [12] corresponds to an 
exponentially growing replicator, but this system only works if it is supplied with 
very specific complementary oligomer strands. This example does not seem very 
close to the r reaction in our model. For the r reaction, we envisage a template 
directed process (e.g. on the surface of a clay catalyst) in which a single strand 
acts rather passively as a template. Although the template ability of different 
RNAs might depend to some extent on the sequence, it seems likely that most 
sequences could act as a passive template to some degree. If the mineral 
catalyzed r process was already good enough to support exponential growth (r 
> u), then there would be immediate multiplication of large numbers of RNAs 
with no sequence specificity, which seems unreasonable. The reviewer's 
suggestion that the non-living state need not be dynamically stable makes 
sense as a logical possibility, but it seems unlikely to us, because the origin of 
life would just be too easy in this case. 

The possible interplay between template directed processes and ribozyme- 
catalyzed processes is interesting and is not fully captured in the models we 
studied so far. In our view, a non-specific r process could be important in 
generating diversity of RNA strands prior to life, but life itself would only 
arise when specific ribozymes came along to carry out the k process. Note 
that the ligase replicator in [12] is not acting as a template to specify the 
sequence of the new strand, because the sequence is already specified in 
the oligomer strands that are supplied to it. For an RNA strand to act as a 
catalyst it needs to fold to a specific structure. It is difficult to see how any 
strand could be a folded catalyst and an unfolded template at the same 
time. Therefore we are led almost inevitably to consider two-molecule 
replication processes. 

Reply to the authors' reply: Given the persistent differences in the definitions 
of life, it is all the more interesting that these models — conceived under 
different views of life — produce the same general result based on the same 
principle. The authors consider that there is unlikely to be a replicator that 
grows exponentially and requires specific sequence patterns to act both as a 
catalyst and as a template (I assumed such replicators when I suggested 
that the non-living state need not be dynamically stable). The ligase 
replicator of Lincoln and Joyce (2009) is an example of such replicators as 
elaborated below. Although it does not answer the question I raised (its 
growth requires substrates prepared by humans), it suggests great 
possibilities of what RNA molecules can do (it was generated by (only) six 
rounds of in vitro selection). 

Contrary to the authors' interpretation, the ligase replicator of Lincoln and 
Joyce (2009) does act as a template to specify the sequence of the new 
strand as follows. Their replicator consists of two ligases (denoted by E 
and E'), each composed of two substrates (denoted by A and B, and A' 
and B', respectively). The substrates A and B' contain sequences 
complementary to each other, and so do the substrates A' and B. Based 
on this complementarity, each ligase (say E) forms a complex with two 
substrates (A' and B') and catalyzes the ligation between them, 
synthesizing the other ligase (E'). In this way, a ligase acts both as a 
template and as a catalyst for the synthesis of the other ligase. More 
important, Lincoln and Joyce (2009) generated 12 sets of distinct 
substrates (denoted by An, Bn, A'n, and B'n where n ranges from 1 to 12). 
During a serial transfer experiment, base pair mismatches generate 
"recombinants" such as a ligase composed of A5 and B3 (denoted by 
E5,3). Such a recombinant can replicate; for example, an E5,3 forms a 
complex with A'3 and B'5 and catalyzes the synthesis of an E'3,5, which, 
in turn, catalyzes the synthesis of an E5,3. In this way, the replicator can 
transmit "genetic information", thanks to the ability to act as a template. 
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